The data used is a dataframe where each row is a player and each column is the amount of shots attempted by position in the court during 2018-2019(one column per position, with more than 500 spots available).
Non-Negative Matrix Factorization helps understand the underlying structure of our matrix of data by decomposing it into two matrices: W and H, under the constraint that all values are positives.
Choosing a lower Rank for the NNMF than the original data has, leads us to a non perfect reconstruction but we are happy with that as long as we gain in structure understanding. We decided to use 15 patterns in this case.
Shooting data by player and position on the court can be expressed as groups of players having different shooting patterns. NNMF gives us that. To the left we have the 15 different patterns discovered in the data. Below how much each player shoots from each of them.
Appendix
\[ A = WH \] \[ \min W,H \text{ }\text{ } ||A - WH||^2_F \text{ }\text{ } s.t. \text{ }\text{ } W \ge 0, H \ge 0\]
---
title: "NNMF Decomposition of the NBA shots 2018-2019"
output:
flexdashboard::flex_dashboard:
orientation: columns
vertical_layout: fill
theme: lumen
source: embed
---
```{r setup, include=FALSE}
library(tidyverse)
library(flexdashboard)
library(patchwork)
library(janitor)
library(plotly)
source("functions.R")
bins = readRDS("bins.rds")
row_byplayer = readRDS("row_byplayer.rds") %>%
distinct(playerId,.keep_all = T)
plot_explained = readRDS("plot_explained.rds")
plot_explained_mod = plot_explained +
theme(panel.border = element_blank(),
axis.text.y = element_text(size = 6)) +
# Sacarle blanco a la izquierda del chart
xlab("Rank") +
ylab("") +
NULL
# NMF ----
nmf = readRDS("nmf.rds")
w = NMF::basis(nmf)
h = NMF::coef(nmf)
structure = t(h) %>%
as.data.frame() %>%
cbind.data.frame(bins)
player_nmf = cbind.data.frame(row_byplayer, w) %>%
clean_names() %>%
select(-player_id) %>%
mutate_if(is.numeric, function(x) {x*1000})
plots = vector(mode = "list")
for (i in 1:14){
plots[[i]] = plot_structure(structure, as.name(glue::glue("V",i))) + guides(colour = "none")
}
plots[[15]] = plot_structure(structure, as.name("V15"))
```
Column {data-width=600}
-----------------------------------------------------------------------
### SHOT PATTERNS {.no-padding}
```{r, fig.height=10, fig.width=10}
wrap_plots(plots, ncol = 3, nrow = 5) &
theme(legend.position = 'bottom')
```
Column {data-width=400 }
-----------------------------------------------------------------------
### INTRO
* The data used is a dataframe where each row is a player and each column is the amount of shots attempted by position in the court during 2018-2019(one column per position, with more than 500 spots available).
* Non-Negative Matrix Factorization helps understand the underlying structure of our matrix of data by decomposing it into two matrices: W and H, under the constraint that all values are positives.
* Choosing a lower Rank for the NNMF than the original data has, leads us to a non perfect reconstruction but we are happy with that as long as we gain in structure understanding. We decided to use 15 patterns in this case.
* Shooting data by player and position on the court can be expressed as groups of players having different shooting patterns. NNMF gives us that. To the left we have the 15 different patterns discovered in the data. Below how much each player shoots from each of them.
*Appendix*
$$ A = WH $$
$$ \min W,H \text{ }\text{ } ||A - WH||^2_F \text{ }\text{ } s.t. \text{ }\text{ } W \ge 0, H \ge 0$$
### PATTERN WEIGHT BY PLAYER (ONE PER EACH OF THE 15 PATTERNS)
```{r}
DT::datatable(player_nmf, options=list(scrollX=T, pageLength = 7 )) %>%
DT::formatCurrency(c(4:18),currency = "", digits = 0)
```